The heat kernel as the pagerank of a graph

Author

  • Fan Chung
Abstract

The concept of pagerank originated as a way for Web search engines to determine the ranking of Webpages. Based on relations in interconnected networks, pagerank has become a major tool for addressing fundamental problems arising in general graphs, especially for large information networks with hundreds of thousands of nodes. A notable notion of pagerank, introduced by Brin and Page and denoted by PageRank, is based on random walks and is expressed as a geometric sum. In this paper we consider a new notion of pagerank which is based on the (discrete) heat kernel and can be expressed as an exponential sum of random walks. The heat kernel satisfies the heat equation and can be used to analyze many useful properties of random walks in a graph. A local Cheeger inequality is established, which implies that, by focusing on cuts determined by linear orderings of vertices using the heat kernel pageranks, the resulting partition is within a quadratic factor of the optimum. This holds even if we restrict the volume of the small part separated by the cut to be close to some specified target value. This leads to a graph partitioning algorithm whose running time is proportional to the size of the targeted volume (instead of the size of the whole graph).

Introduction

In the development of quantitative ranking for Webpages, many mathematical methods have come into play. The Hub-and-Authority algorithm by Kleinberg [11] uses eigenvectors. The PageRank introduced by Brin and Page [3] is based on random walks. These pagerank algorithms rely mainly on the network structure of the Web. The viewpoint is to regard the Web as a graph, with Webpages as vertices and links between pairs of Webpages as edges. Various notions of pagerank are computed using the Webgraph and are then used for numerous applications, such as identifying communities or finding hot spots in various information networks.
Another example is the use of PageRank to derive a local graph partitioning algorithm [1], which can be computed very efficiently in the sense that the cost of computing is proportional to the size of the small part of the partition, in contrast with a generic partitioning algorithm whose cost depends on the size of the whole graph.

In this paper, we introduce a new notion of pagerank by using the heat kernel of a graph. Similar to PageRank, the heat kernel pagerank is based on random walks, but it has the extra benefit of satisfying the heat equation. Originally rooted in spectral geometry [15], the heat equation for graphs involves a parameter t, the heat, which allows additional control of the rate of diffusion (see detailed definitions later). Using the heat equation, the heat kernel pagerank is amenable to various mathematical analyses of the graph.

A key isoperimetric invariant of a graph is the Cheeger constant, which measures how good a cut can be found. The classical Cheeger inequality concerns the relationship between the Cheeger constant and eigenvalues of the (normalized) Laplacian of a graph. (A graph can be viewed as a discrete version of a manifold, where the original Cheeger inequality applies [4].) Here we will prove several variations of the Cheeger inequality, establishing relationships between the Cheeger constant and the heat kernel pagerank. One of the consequences of the local Cheeger inequality is that, for a given value s, the minimum Cheeger ratio of subsets of

∗ Research supported in part by NSF Grants DMS 0457215 and ITR 0426858
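The contrast drawn above between a geometric sum and an exponential sum of random walks, and the idea of cutting along a linear ordering of vertices, can be made concrete with a small sketch. The excerpt does not give the formulas, so the code assumes the commonly used forms: PageRank as α Σ_k (1−α)^k sW^k and heat kernel pagerank as e^{−t} Σ_k (t^k/k!) sW^k, where s is a seed distribution and W = D^{-1}A is the random walk transition matrix. The function names, the truncation lengths, the ordering by score divided by degree, and the toy graph are all illustrative assumptions, not the paper's algorithm or its volume-restricted analysis.

```python
import math

def walk_step(dist, adj):
    """One random walk step: dist -> dist W, with W = D^{-1} A (assumed form)."""
    out = [0.0] * len(adj)
    for u, mass in enumerate(dist):
        if mass:
            share = mass / len(adj[u])
            for v in adj[u]:
                out[v] += share
    return out

def weighted_walk_sum(adj, seed, coeffs):
    """Return sum_k coeffs[k] * seed W^k for a finite list of coefficients."""
    total = [0.0] * len(adj)
    walk = list(seed)
    for c in coeffs:
        for v, mass in enumerate(walk):
            total[v] += c * mass
        walk = walk_step(walk, adj)
    return total

def heat_kernel_pagerank(adj, seed, t, terms=60):
    """Truncated exponential sum with coefficients e^{-t} t^k / k!."""
    coeffs = [math.exp(-t) * t**k / math.factorial(k) for k in range(terms)]
    return weighted_walk_sum(adj, seed, coeffs)

def pagerank(adj, seed, alpha, terms=200):
    """Truncated geometric sum with coefficients alpha (1 - alpha)^k."""
    coeffs = [alpha * (1 - alpha)**k for k in range(terms)]
    return weighted_walk_sum(adj, seed, coeffs)

def sweep_cut(adj, scores):
    """Order vertices by score/degree and return the prefix set with the
    smallest Cheeger ratio |boundary| / min(vol(S), vol(complement))."""
    total_vol = sum(len(nbrs) for nbrs in adj)
    order = sorted(range(len(adj)),
                   key=lambda v: scores[v] / len(adj[v]), reverse=True)
    inside, vol, boundary = set(), 0, 0
    best_ratio, best_size = float("inf"), 0
    for i, v in enumerate(order[:-1]):      # skip the trivial full set
        inside.add(v)
        vol += len(adj[v])
        # Edges from v into the set stop crossing; the rest start crossing.
        internal = sum(1 for w in adj[v] if w in inside)
        boundary += len(adj[v]) - 2 * internal
        ratio = boundary / min(vol, total_vol - vol)
        if ratio < best_ratio:
            best_ratio, best_size = ratio, i + 1
    return set(order[:best_size]), best_ratio

# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
seed = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
hk = heat_kernel_pagerank(adj, seed, t=1.0)
cut, ratio = sweep_cut(adj, hk)
print(sorted(cut), ratio)
```

On this toy graph the sweep should recover the seed's triangle, whose single crossing edge gives Cheeger ratio 1/7. Both diffusions share the same helper and differ only in their coefficient sequence. This sketch touches every vertex at each step, so it does not achieve the running time proportional to the targeted volume that the abstract describes.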

Similar resources

A Local Graph Partitioning Algorithm Using Heat Kernel Pagerank

We give an improved local partitioning algorithm using heat kernel pagerank, a modified version of PageRank. For a subset S with Cheeger ratio (or conductance) h, we show that at least a quarter of the vertices in S can serve as seeds for heat kernel pagerank which lead to local cuts with Cheeger ratio at most $O(\sqrt{h})$, improving the previous bound by a factor of $\sqrt{\log |S|}$.

Computing Heat Kernel Pagerank and a Local Clustering Algorithm

Heat kernel pagerank is a variation of Personalized PageRank given in an exponential formulation. In this work, we present a sublinear time algorithm for approximating the heat kernel pagerank of a graph. The algorithm works by simulating random walks of bounded length and runs in time $O\!\left(\frac{\log(\epsilon^{-1}) \log n}{\epsilon^{3} \log\log(\epsilon^{-1})}\right)$, assuming performing a random walk step and sampling from a distribution with b...

Community Detection Using Time-Dependent Personalized PageRank

Local graph diffusions have proven to be valuable tools for solving various graph clustering problems. As such, there has been much interest recently in efficient local algorithms for computing them. We present an efficient local algorithm for approximating a graph diffusion that generalizes both the celebrated personalized PageRank and its recent competitor/companion the heat kernel. Our algor...

A Semi-supervised Heat Kernel Pagerank MBO Algorithm for Data Classification

We present a very efficient semi-supervised graph-based algorithm for classification of high-dimensional data that is motivated by the MBO method of Garcia-Cardona (2014) and derived using the similarity graph. Our procedure is an elegant combination of heat kernel pagerank and the MBO method applied to study semi-supervised problems. The timing of our algorithm is highly dependent on how quick...

Four Cheeger-type Inequalities for Graph Partitioning Algorithms

We will give proofs to four isoperimetric inequalities which are variations of the original Cheeger inequality relating eigenvalues of a graph with the Cheeger constant. The first is a simplified proof of the classical Cheeger inequality using eigenvectors. The second is based on a rapid mixing result for random walks by Lovász and Simonovits. The third uses PageRank, a quantitative ranking of ...

Publication date: 2007